# Image-to-text conversion
Google.gemma 3 4b It Qat Int4 Unquantized GGUF
A quantized image-to-text model based on Gemma 3 4B, published with the aim of making knowledge accessible to the public.
Image-to-Text
DevQuasar
161
1
Ibm Granite.granite Vision 3.2 2b GGUF
Granite Vision 3.2 2B is a vision-language model developed by IBM, focusing on image-to-text tasks.
Image-to-Text
DevQuasar
211
1
Thai Handwriting Llm
Apache-2.0
A LoRA-adapted vision-language model based on Llama-3.2-11B-Vision-Instruct, capable of transcribing Thai handwritten text from images.
Image-to-Text
Safetensors Other
Aekanun
9
6
Donut Finetune Rvl Cdip
Apache-2.0
A document classification model based on the Donut framework, trained on a small-scale version of the RVL-CDIP dataset.
Image-to-Text
Transformers English

sitloboi2012
18
0
Pix2struct Screen2words Large
Apache-2.0
A large vision-language model based on the Pix2Struct architecture, fine-tuned to generate descriptions of what a UI screen does.
Image-to-Text
Transformers Supports Multiple Languages

google
176
19
Image Captioning Portuguese
Apache-2.0
This model generates Portuguese captions for images and is built on the ViT and GPT2 architectures.
Image-to-Text Other
adalbertojunior
17
1